Safe Q-Learning on Complete History Spaces

Authors

  • Stephan Timmer
  • Martin A. Riedmiller
Abstract

In this article, we present an approach to solving deterministic partially observable Markov decision processes (POMDPs) based on a history space containing sequences of past observations and actions. A novel and sound technique for learning a Q-function on history spaces is developed and discussed. We analyze conditions under which a history-based approach is able to learn policies comparable to the optimal solution on belief states. The algorithm presented is model-free and can be combined with any method for learning history spaces. We also present a procedure for learning history spaces that are especially suited to our Q-learning algorithm.
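To make the idea of a Q-function over histories concrete, here is a minimal sketch, not the authors' safe Q-learning algorithm, of ordinary tabular Q-learning keyed on a fixed-length window of past (action, observation) pairs. Everything in it is an assumption made for illustration: the five-cell corridor, the observation labels, the window length `k`, the optimistic initial value of 1.0, and the function names `step`, `train` and `greedy_rollout`. Cells 1 and 3 emit the same observation, so only the history tells the agent which way the goal lies.

```python
import random
from collections import defaultdict, deque

# Hypothetical 5-cell corridor. Cells 1 and 3 both emit "H", so the current
# observation alone does not identify the underlying state.
OBS = ["L", "H", "G", "H", "R"]
GOAL = 2
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(cell, action):
    """Deterministic transition; reward 1.0 only on entering the goal cell."""
    nxt = max(0, min(4, cell + (1 if action == 1 else -1)))
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def train(episodes=500, k=2, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over windows of k (action, obs) pairs."""
    rng = random.Random(seed)
    # Optimistic initial values (an assumption of this sketch) push the
    # greedy choice toward actions that have not been tried yet.
    q = defaultdict(lambda: 1.0)  # maps (history, action) -> value estimate
    for ep in range(episodes):
        cell = 0 if ep % 2 == 0 else 4  # alternate between the two start cells
        hist = deque([(None, OBS[cell])], maxlen=k)
        for _ in range(20):  # cap on steps per episode
            h = tuple(hist)
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(h, x)])
            cell, r, done = step(cell, a)
            hist.append((a, OBS[cell]))
            h2 = tuple(hist)
            target = r if done else r + gamma * max(q[(h2, x)] for x in ACTIONS)
            q[(h, a)] += alpha * (target - q[(h, a)])
            if done:
                break
    return q


def greedy_rollout(q, start, k=2, max_steps=10):
    """Follow the greedy policy; return steps to the goal, or None if not reached."""
    cell, steps = start, 0
    hist = deque([(None, OBS[start])], maxlen=k)
    while cell != GOAL and steps < max_steps:
        h = tuple(hist)
        a = max(ACTIONS, key=lambda x: q[(h, x)])
        cell, _, _ = step(cell, a)
        hist.append((a, OBS[cell]))
        steps += 1
    return steps if cell == GOAL else None


if __name__ == "__main__":
    q = train()
    print(greedy_rollout(q, 0), greedy_rollout(q, 4))
```

In this toy corridor a window of two pairs happens to determine the underlying cell, which is why plain Q-learning suffices here. The abstract's concern is the general case, where the history space itself has to be chosen or learned so that such updates remain sound.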


Similar resources

ON Q-BITOPOLOGICAL SPACES

We study here $T_{0}$-$Q$-bitopological spaces and sober $Q$-bitopological spaces and their relationship with two particular Sierpinski objects in the category of $Q$-bitopological spaces. The epireflective hulls of both these Sierpinski objects in the category of $Q$-bitopological spaces turn out to be the category of $T_0$-$Q$-bitopological spaces. We show that only one of these Sierpinski ob...


FORMAL BALLS IN FUZZY PARTIAL METRIC SPACES

In this paper, the poset $BX$ of formal balls is studied in fuzzy partial metric space $(X,p,*)$. We introduce the notion of layered complete fuzzy partial metric space and get that the poset $BX$ of formal balls is a dcpo if and only if $(X,p,*)$ is layered complete fuzzy partial metric space.


Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads

This paper addresses the problem of predicting popularity of comments in an online discussion forum using reinforcement learning, particularly addressing two challenges that arise from having natural language state and action spaces. First, the state representation, which characterizes the history of comments tracked in a discussion at a particular point, is augmented to incorporate the global ...


Cartesian-closedness of the category of $L$-fuzzy Q-convergence spaces

The definition of $L$-fuzzy Q-convergence spaces was presented by Pang and Fang in 2011. However, Cartesian-closedness of the category of $L$-fuzzy Q-convergence spaces was not investigated. This paper focuses on Cartesian-closedness of the category of $L$-fuzzy Q-convergence spaces, and it is shown that the category $L$-$\mathbf{QFCS}$ of $L$-fuzzy Q-convergence spaces is Cartesian-closed.


Evaluating project’s completion time with Q-learning

Nowadays project management is a key component in introductory operations management. The educators and the researchers in these areas advocate representing a project as a network and applying the solution approaches for network models to them to assist project managers to monitor their completion. In this paper, we evaluated project’s completion time utilizing the Q-learning algorithm. So the ...




Publication date: 2007